27 research outputs found

    Integration of Biological Sources: Exploring the Case of Protein Homology

    Data integration is a key issue in the domain of bioinformatics, which deals with huge amounts of heterogeneous biological data that grow and change rapidly. This paper serves as an introduction to the field of bioinformatics and the biological concepts it deals with, and as an exploration of the integration problems a bioinformatics scientist faces. We examine ProGMap, an integrated protein homology system used by bioinformatics scientists at Wageningen University, and several use cases related to protein homology. A key issue we identify is the huge manual effort required to unify source databases into a single resource. Uncertain databases are able to contain several possible worlds, and it has been proposed that they can be used to significantly reduce initial integration efforts. We propose several directions for future work where uncertain databases can be applied to bioinformatics, with the goal of furthering the cause of bioinformatics integration.

    Comparative analysis indicates that alternative splicing in plants has a limited role in functional expansion of the proteome

    BACKGROUND: Alternative splicing (AS) is a widespread phenomenon in higher eukaryotes but the extent to which it leads to functional protein isoforms and to proteome expansion at large is still a matter of debate. In contrast to animal species, for which AS has been studied extensively at the protein and functional level, protein-centered studies of AS in plant species are scarce. Here we investigate the functional impact of AS in dicot and monocot plant species using a comparative approach. RESULTS: Detailed comparison of AS events in alternatively spliced orthologs from the dicot Arabidopsis thaliana and the monocot Oryza sativa (rice) revealed that the vast majority of AS events in both species do not result from functional conservation. Transcript isoforms that are putative targets for the nonsense-mediated decay (NMD) pathway are as likely to contain conserved AS events as isoforms that are translated into proteins. Similar results were obtained when the same comparison was performed between the two more closely related monocot species rice and Zea mays (maize). Genome-wide computational analysis of functional protein domains encoded in alternatively and constitutively spliced genes revealed that only the RNA recognition motif (RRM) is overrepresented in alternatively spliced genes in all species analyzed. In contrast, three domain types were overrepresented in constitutively spliced genes. AS events were found to be less frequent within than outside predicted protein domains and no domain type was found to be enriched with AS introns. Analysis of AS events that result in the removal of complete protein domains revealed that only a small number of domain types is spliced out in all species analyzed. Finally, in a substantial fraction of cases where a domain is completely removed, this domain appeared to be a unit of a tandem repeat. CONCLUSION: The results from the ortholog comparisons suggest that the ability of a gene to produce more than one functional protein through AS does not persist during evolution. Cross-species comparison of the results of the protein-domain oriented analyses indicates little correspondence between the analyzed species. Based on the premise that functional genetic features are most likely to be conserved during evolution, we conclude that AS has only a limited role in functional expansion of the proteome in plants.

    Genetic complexity of miscanthus cell wall composition and biomass quality for biofuels

    BACKGROUND: Miscanthus sinensis is a high yielding perennial grass species with great potential as a bioenergy feedstock. One of the challenges that currently impedes commercial cellulosic biofuel production is the technical difficulty to efficiently convert lignocellulosic biomass into biofuel. The development of feedstocks with better biomass quality will improve conversion efficiency and the sustainability of the value-chain. Progress in the genetic improvement of biomass quality may be substantially expedited by the development of genetic markers associated to quality traits, which can be used in a marker-assisted selection program. RESULTS: To this end, a mapping population was developed by crossing two parents of contrasting cell wall composition. The performance of 182 F1 offspring individuals along with the parents was evaluated in a field trial with a randomized block design with three replicates. Plants were phenotyped for cell wall composition and conversion efficiency characters in the second and third growth season after establishment. A new SNP-based genetic map for M. sinensis was built using a genotyping-by-sequencing (GBS) approach, which resulted in 464 short-sequence uniparental markers that formed 16 linkage groups in the male map and 17 linkage groups in the female map. A total of 86 QTLs for a variety of biomass quality characteristics were identified, 20 of which were detected in both growth seasons. Twenty QTLs were directly associated to different conversion efficiency characters. Marker sequences were aligned to the sorghum reference genome to facilitate cross-species comparisons. Analyses revealed that for some traits previously identified QTLs in sorghum occurred in homologous regions on the same chromosome. CONCLUSION: In this work we report for the first time the genetic mapping of cell wall composition and bioconversion traits in the bioenergy crop miscanthus. These results are a first step towards the development of marker-assisted selection programs in miscanthus to improve biomass quality and facilitate its use as feedstock for biofuel production.

    FLC and SVP Are Key Regulators of Flowering Time in the Biennial/Perennial Species Noccaea caerulescens

    The appropriate timing of flowering is crucial for plant reproductive success. Studies of the molecular mechanism of flower induction in the model plant Arabidopsis thaliana showed long days and vernalization as major environmental promotive factors. Noccaea caerulescens has an obligate vernalization requirement that has not been studied at the molecular genetics level. Here, we characterize the vernalization requirement and response of four geographically diverse biennial/perennial N. caerulescens accessions: Ganges (GA), Lellingen (LE), La Calamine (LC), and St. Felix de Pallières (SF). Differences in vernalization responsiveness among accessions suggest that natural variation for this trait exists within N. caerulescens. Mutants which fully abolish the vernalization requirement were identified and were shown to contain mutations in the FLOWERING LOCUS C (NcFLC) and SHORT VEGETATIVE PHASE (NcSVP) genes, two key floral repressors in this species. At high temperatures, the non-vernalization requiring flc-1 mutant reverts from flowering to vegetative growth, which is accompanied by reduced expression of LFY and AP1. This suggested there is “crosstalk” between vernalization and ambient temperature, which might be a strategy to cope with fluctuations in temperature or adopt a more perennial flowering attitude and thus facilitate a flexible evolutionary response to the changing environment across the species range.

    Assessing the contribution of alternative splicing to proteome diversity in Arabidopsis thaliana using proteomics data

    BACKGROUND: Large-scale analyses of genomics and transcriptomics data have revealed that alternative splicing (AS) substantially increases the complexity of the transcriptome in higher eukaryotes. However, the extent to which this complexity is reflected at the level of the proteome remains unclear. On the basis of a lack of conservation of AS between species, we previously concluded that AS does not frequently serve as a mechanism that enables the production of multiple functional proteins from a single gene. Following this conclusion, we hypothesized that the extent to which AS events contribute to the proteome diversity in Arabidopsis thaliana would be lower than expected on the basis of transcriptomics data. Here, we test this hypothesis by analyzing two large-scale proteomics datasets from Arabidopsis thaliana. RESULTS: A total of only 60 AS events could be confirmed using the proteomics data. However, for about 60% of the loci that, based on transcriptomics data, were predicted to produce multiple protein isoforms through AS, no isoform-specific peptides were found. We therefore performed in silico AS detection experiments to assess how well AS events were represented in the experimental datasets. The results of these in silico experiments indicated that the low number of confirmed AS events was the consequence of a limited sampling depth rather than in vivo under-representation of AS events in these datasets. CONCLUSION: Although the impact of AS on the functional properties of the proteome remains to be uncovered, the results of this study indicate that AS-induced diversity at the transcriptome level is also expressed at the proteome level.

    Biological process annotation of proteins across the plant kingdom

    Accurate annotation of protein function is key to understanding life at the molecular level, but automated annotation of functions is challenging. We here demonstrate the combination of a method for protein function annotation that uses network information to predict the biological processes a protein is involved in, with a sequence-based prediction method. The combined function prediction is based on co-expression networks and combines the network-based prediction method BMRF with the sequence-based prediction method Argot2. The combination shows significantly improved performance compared to each of the methods separately, as well as compared to Blast2GO. The approach was applied to predict biological processes for the proteomes of rice, barrel clover, poplar, soybean and tomato. The novel function predictions are available at www.ab.wur.nl/bmrf. Analysis of the relationships between sequence similarity and predicted function similarity identifies numerous cases of divergence of biological processes in which proteins are involved, in spite of sequence similarity. This indicates that the integration of network-based and sequence-based function prediction is helpful towards the analysis of evolutionary relationships. Examples of potential divergence are identified for various biological processes, notably for processes related to cell development, regulation, and response to chemical stimulus. Such divergence in biological process annotation for proteins with similar sequences should be taken into account when analyzing plant gene and genome evolution. DATA: All gene function predictions are available online (http://www.ab.wur.nl/bmrf/). The online resource can be queried for predictions of proteins or for Gene Ontology terms of interest, and the results can be downloaded in bulk. Queries can be based on protein identifiers, biological process Gene Ontology identifiers, or text descriptors of biological processes.

    Low Temperature Affects Stem Cell Maintenance in Brassica oleracea Seedlings

    Most of the above-ground tissues in higher plants originate from stem cells located in the shoot apical meristem (SAM). Several plant species can suffer from spontaneous stem cell arrest resulting in lack of further shoot development. In Brassica oleracea this SAM arrest is known as blindness and occurs in an unpredictable manner leading to considerable economic losses for plant raisers and farmers. Detailed analyses of seedlings showed that stem cell arrest is triggered by low temperatures during germination. To induce this arrest reproducibly and to study the effect of the environment, an assay was developed. The role of genetic variation on the susceptibility to develop blind seedlings was analyzed by a quantitative genetic mapping approach, using seeds from a double haploid population from a cross between broccoli and Chinese kale, produced at three locations. The analysis revealed, besides an effect of the seed production location, a region on linkage group C3 associated with blindness sensitivity. A subsequent dynamic genome-wide transcriptome analysis resulted in the identification of around 3000 differentially expressed genes early after blindness induction. A large number of cell cycle genes were induced en masse early during the development of blindness, whereas shortly after, all were down-regulated. This misregulation of core cell cycle genes is accompanied by a strong reduction of cells reaching the DNA replication phase. From the differentially expressed genes, 90 were located in the QTL region C3. Among them are two genes belonging to the MINICHROMOSOMAL MAINTENANCE gene family, known to be involved in DNA replication, a RETINOBLASTOMA-RELATED gene, a key regulator for cell cycle initiation, and several MutS homolog genes, involved in DNA repair. These genes are potential candidates for being involved in the development of blindness in Brassica oleracea sensitive genotypes.